Abstract

The human gut microbiome consists of at least 3 million non-redundant genes, 150 times that of the core human genome. 
Herein, we report the identification and characterisation of a novel stress tolerance gene from the human gut metagenome. 
The locus, assigned brpA, encodes a membrane protein with homology to a brp/blh-family β-carotene monooxygenase.
Cloning and heterologous expression of brpA in Escherichia coli confers a significant salt tolerance phenotype. Furthermore,
when cultured in the presence of exogenous β-carotene, cell pellets adopt a red/orange pigmentation indicating the
incorporation of carotenoids in the cell membrane.
Introduction

Metagenomics provides a culture-independent means to access 
and study the genetic content of all of the microorganisms in a 
particular environmental niche. Metagenomic analysis can be 
sequence-based or functional (or a combination of both). The
development of faster, cheaper and more accurate next-generation 
sequencing (NGS) technologies has allowed new insights into 
microbial community structure and diversity and has led to the 
discovery of many novel genetic loci [1-4]. Functional metage- 
nomics has also been utilised to identify many novel functions 
through cloning and heterologous expression of metagenomic 
DNA and subsequent phenotypic detection of a desired trait 
conferred on the cloning host. Some notable examples include 
genes encoding proteins of industrial, pharmaceutical and medical 
relevance such as lipases, esterases and novel antibiotics [5-8]. 

The human gut microbiome has become perhaps the most 
intensively studied environment using metagenomics [9,10]. 
Collectively, there are at least 150 times as many genes in the 
human gut microbiome than there are human genes in the 
genome, a large proportion of which are uncharacterised [11]. 
The ability to respond and adapt to external environmental 
stresses is key to microbial survival and it is possible to use 
metagenomics to identify novel mechanisms that enable such 



survival [12]. In the gastrointestinal (GI) tract microorganisms are 
faced with numerous challenges such as low pH, low iron 
concentrations, increased osmolarity, bile, immunity mechanisms 
and competing microbes [13,14]. Different sets of genes are 
activated in response to environmental cues [15]. Work in our lab 
is focused on genes that confer increased tolerance to osmotic 
stress [16]. The response to osmotic stress is broad and 
encompasses many diverse cellular processes and systems [17]. 
Metagenomics makes it possible to identify novel systems 
unrelated to the classical (and comprehensively studied) primary 
and secondary responses of potassium (K+) uptake and osmoprotectant
utilisation [18-20]. We have previously identified a
number of novel salt tolerance loci from the human gut microbiota 
using a combination of functional metagenomic screening, next- 
generation sequencing and bioinformatic analyses [21-23]. 

In this study we report the identification of a novel salt tolerance 
gene from a human gut metagenomic library we have previously 
screened [22]. An in silico analysis revealed the gene (which we 
have termed brpA) encoded a putative carotenoid modifying 
enzyme with homology to a brp/blh-family β-carotene 15,15'-monooxygenase
protein, which cleaves β-carotene to two molecules
of all-trans retinal (vitamin A aldehyde) [24,25]. Finally, we
demonstrate that brpA confers an increased salt tolerance
phenotype when heterologously expressed in Escherichia coli.
Materials and Methods

Bacterial strains and growth conditions 

Bacterial strains and plasmids used in this study are listed in 
Table 1. Oligonucleotide primers (synthesised by Eurofins MWG
Operon, Germany) are presented in Table S1. E. coli
EPI300::pCC1FOS (Epicentre Biotechnologies, Madison, WI,
USA) was cultured in Luria-Bertani (LB) medium containing
12.5 μg/ml chloramphenicol (Cm) and in 12.5 μg/ml chloramphenicol
plus 50 μg/ml kanamycin (Kan) following EZ-Tn5
transposon mutagenesis. E. coli MKH13 was grown in LB and
LB supplemented with 20 μg/ml Cm for strains transformed with
the plasmid pCI372. E. coli strains containing the pBAD
expression vector were cultured in the presence of 100 μg/ml
ampicillin.

For growth in minimal media, strains were grown in M9 (Fluka) 
minimal salts supplemented with final concentrations of 0.4% 
glucose, 0.2% casamino acids, 2 mM magnesium sulphate 
(MgSO4) and 0.1 mM calcium chloride (CaCl2). When required,
stock solutions of β-carotene were added to media at a final
concentration of 20 μM. Growth media were supplemented with
1.5% agar for plate assays. All overnight cultures were grown with
shaking at 37°C.

Construction and screening of the human gut metagenomic library

A previously constructed fosmid clone library, created from
metagenomic DNA from the human gut microbiome [28], was
used to screen for salt-tolerant clones. The library was screened
using the protocol outlined by Culligan et al. [22]. Briefly, a total of
23,040 library clones were screened on LB agar supplemented
with 6.5% (w/v) NaCl using a Genetix QPix 2 XT™ colony 
picking/gridding robotics platform. Plates were incubated at 37°C 
for 2-3 days and checked periodically for growth of likely salt- 
tolerant clones. 

Sequencing and bioinformatic analysis 

The fosmid insert from clone SMG 6 was fully sequenced and
assembled by GATC Biotech (Konstanz, Germany) using the GS-FLX
454 pyrosequencing (Roche) platform on a Titanium mini-run.
The full sequence of SMG 6 can be found in GenBank under
the accession number JQ269599.1. Putative open reading frames 
were predicted using Softberry FGENESB bacterial operon and 
gene prediction software (www.softberry.com) and also GeneMark 
[29]. Retrieved nucleotide and translated amino acid sequences 
were functionally annotated by homology searches using the Basic
Local Alignment Search Tool (BLAST) to identify homologous
sequences from the National Centre for Biotechnology
Information (NCBI) website: http://www.ncbi.nlm.nih.gov/blast/Blast.cgi.
The following databases and tools were used to gain
additional information on the BrpA protein: Conserved Domain 
Database (CDD), PROSITE motif search, SignalP 4.0, HMMER, 
TMHMM, HHPred, and Softberry BProm promoter search 
(www.softberry.com) [30-36]. 

The Fold and Function Assignment System (FFAS03) is a
profile-profile and fold recognition algorithm that can detect
remote homology between proteins [37]. Profile-profile comparisons
have increased sensitivity compared to sequence-sequence or
profile-sequence algorithms. FFAS03 searches numerous databases
including non-redundant (nr) NCBI, Global Ocean Sampling
(GOS) from JCVI, PDB, SCOP, and COG, as well as numerous
metagenome datasets including MetaHit [11], which contains over
3 million unique genes from the human gut microbiome. The
BrpA protein sequence was submitted to the server to identify 



proteins with homology based on FFAS profiling or sequence 
homology by BLAST and PSI-BLAST against the databases and 
metagenome datasets. The FFAS03 server can be found at: 
http://ffas.burnham.org/ffas-cgi/cgi/document.pl.

The Integrated Microbial Genomes and Metagenomes (IMG/ 
M) [38] is a data management system for the comparative analysis 
of metagenome sequence data. IMG/M-HMP [39] specifically 
contains metagenome data from the Human Microbiome Project 
(HMP) [40]. It contains 748 metagenome datasets generated from
sequencing samples from different body sites, as well as tools for
comparative analysis between hosted sequences and user-supplied
sequences. The BrpA protein sequence was used as a query sequence
to BLAST (1e-05 and 1e-50 maximum e-value cut-offs) against all
the available metagenomes from 17 body sites from the HMP 
dataset. The IMG/M-HMP server can be found at: http://www. 
hmpdacc-resources.org/cgi-bin/imgm_hmp/main.cgi. 
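
Searching at a lenient and a strict cut-off amounts to filtering the same hit list at two thresholds. The following is a minimal sketch of that idea only; the hit names and e-values are invented for illustration and are not data from this study, and `filter_hits` is a hypothetical helper rather than part of any BLAST or IMG/M tool.

```python
def filter_hits(hits, max_evalue):
    """Keep (name, e-value) pairs whose e-value is at or below the cut-off."""
    return [(name, evalue) for name, evalue in hits if evalue <= max_evalue]


# Hypothetical hits, for illustration only (not results from the HMP search).
example_hits = [
    ("hit_A", 3e-62),
    ("hit_B", 4e-20),
    ("hit_C", 2e-06),
    ("hit_D", 0.5),
]

lenient = filter_hits(example_hits, 1e-05)  # most lenient cut-off
strict = filter_hits(example_hits, 1e-50)   # strictest cut-off

print([name for name, _ in lenient])
print([name for name, _ in strict])
```

Every hit that passes the strict cut-off also passes the lenient one, so the strict set is always a subset of the lenient set.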

DNA manipulations and cloning 

Induction of fosmids from low to high copy number was
performed as per the manufacturer's instructions. The Qiagen
QIAprep Spin mini-prep kit was used to extract fosmids using the
protocol outlined by the manufacturer. The brpAL, brpAS and
brpAatfA genes were amplified using ReddyMix PCR mastermix
(Thermo Scientific). PCR products were purified with a Qiagen
PCR purification kit and digested with restriction enzymes XbaI
and PstI (Roche Applied Science), followed by ligation using the
Fast-Link DNA ligase kit (Epicentre Biotechnologies) to similarly
digested plasmid pCI372. Electro-competent E. coli MKH13 were
transformed with the ligation mixture and plated on LB agar
plates containing 20 μg/ml Cm for selection.

The pBAD TOPO TA expression kit (Invitrogen, Carlsbad, CA,
USA) was used to clone the PCR products into the pBAD
expression vector according to the manufacturer's instructions.
The brpAL, brpAS and brpAatfA genes were amplified as outlined
above. The resulting plasmids, containing the genes of interest,
were electroporated into freshly competent E. coli EPI300 and
plated on LB agar containing 100 μg/ml of ampicillin.

Colony PCR was performed on resistant transformants using a
gene and plasmid (pCI372 or pBAD) specific primer combination
to confirm the presence and size of the insert. Inserts were
sequenced to confirm the correct nucleotide sequence (GATC
Biotech, Germany).

Growth experiments 

Cultures were grown overnight in the relevant media (LB or M9
broth). Cells were subsequently harvested, washed in one quarter
strength sterile Ringer's solution and re-suspended in fresh media.
A 2% (v/v) inoculum was sub-cultured in fresh broth containing
sodium chloride (NaCl), and 200 μl was transferred to a sterile
96-well micro-titer plate (Sarstedt Inc., Newton, USA). For minimal
media experiments, filter-sterilised stock solutions of the osmoprotectants
betaine, L-carnitine and L-proline were added to a final
concentration of 1 mM. Micro-titer plates were incubated at 37°C
for 24-48 hours in an automated spectrophotometer (Tecan
Genios) which recorded the OD at 595 nm every hour. The data were
subsequently retrieved and analysed using the Magellan 3 software
program.

Survival in high salt media in the presence and absence of
20 μM β-carotene was assessed by harvesting overnight cultures as
above and sub-culturing in either 3% NaCl or 7% NaCl for
MKH13 and EPI300 strains respectively. Cultures were incubated
at 37°C both aerobically (with shaking) and anaerobically (static)
for 48 hours. Subsequently, serial dilutions of cultures were made
in one quarter strength sterile Ringer's solution and plated on LB
Table 1. Bacterial strains and plasmids.

| Strain, plasmid or transposon | Genotype or characteristic(s) | Source or reference |
|---|---|---|
| Strains | | |
| E. coli EPI300 | mcrA Δ(mrr-hsdRMS-mcrBC) Φ80dlacZΔM15 ΔlacX74 recA1 endA1 araD139 Δ(ara, leu)7697 galU galK rpsL nupG trfA dhfr, high-transformation efficiency of large DNA | Epicentre Biotechnologies, Madison, WI, USA |
| SMG 6 | EPI300 containing pCC1FOS fosmid with ~34 kb of metagenomic DNA from human gut microbiome | This study |
| SMG 6-EZTn5 #24 | Transposon insertion in gene 24 (which precedes acyltransferase gene atfA) | This study |
| SMG 6-EZTn5 #26 | Transposon insertion in brpA gene | This study |
| SMG 6-EZTn5 #34 | Transposon insertion in acyltransferase gene (atfA) | This study |
| SMG 6-EZTn5 #38 | Transposon insertion in brpA gene | This study |
| E. coli MKH13 | MC4100 Δ(putPA)101 Δ(proP)2 Δ(proU) | [26] |
| E. coli MKH13::pCI372 | MKH13 containing empty pCI372 plasmid | This study |
| E. coli MKH13::pCI372-brpAS | MKH13 containing pCI372 with brpAS gene from SMG 6; "S" subscript denotes shorter predicted gene with TTG start codon | This study |
| E. coli MKH13::pCI372-brpAL | MKH13 containing pCI372 with brpAL gene from SMG 6; "L" subscript denotes longer predicted gene with ATG start codon | This study |
| E. coli MKH13::pCI372-brpAatfA | MKH13 containing pCI372 with brpAatfA genes from SMG 6 | This study |
| E. coli EPI300::pBAD-brpAS | EPI300 containing pBAD with brpAS gene from SMG 6 | This study |
| E. coli EPI300::pBAD-brpAL | EPI300 containing pBAD with brpAL gene from SMG 6 | This study |
| E. coli EPI300::pBAD-brpAatfA | EPI300 containing pBAD with brpA and atfA genes from SMG 6 | This study |
| Plasmids | | |
| pCI372 | Shuttle vector between E. coli and L. lactis, CmR | [27] |
| pCC1FOS | Fosmid cloning vector, CmR | Epicentre Biotechnologies, Madison, WI, USA |
| pBAD | L-arabinose inducible expression vector, AmpR | Invitrogen, USA |
| Transposon | | |
| EZ-Tn5<oriV/KAN-2> | Hyperactive Tn5 transposon, KanR, inducible high copy origin of replication - oriV | Epicentre Biotechnologies, Madison, WI, USA |

CmR, KanR and AmpR = chloramphenicol, kanamycin and ampicillin resistance respectively.
doi:10.1371/journal.pone.0103318.t001




agar. Viable cells were enumerated and calculated as the number
of colony forming units per millilitre (CFU/ml).
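
The CFU/ml figure is obtained by scaling a plate count back through the dilution series. The sketch below shows the standard calculation; the colony count, dilution factor and plated volume are illustrative assumptions, since the plated volume is not stated in the text.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Viable count in the undiluted culture.

    colonies          -- colonies counted on the plate
    dilution_factor   -- total fold-dilution of the plated sample (e.g. 1e5)
    plated_volume_ml  -- volume spread on the plate, in ml (assumed value)
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


# Illustrative numbers only: 42 colonies from a 10^-5 dilution, 0.1 ml plated.
count = cfu_per_ml(42, 1e5, 0.1)
print(f"{count:.2e} CFU/ml")
```

In practice only plates within a countable range would be used, and counts from replicate plates would be averaged before scaling.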

Graphs (created using SigmaPlot 10.0) are presented as the 
average of triplicate experiments, with error bars being represen- 
tative of the standard error of the mean (SEM). 
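
For triplicate measurements the SEM is the sample standard deviation divided by the square root of the number of replicates. A minimal sketch using only the Python standard library follows; the optical density values are invented for illustration and `mean_and_sem` is a hypothetical helper, not the SigmaPlot procedure.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for the SEM")
    mean = statistics.mean(values)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical OD readings from three replicate wells at one time point.
replicates = [0.52, 0.55, 0.49]
mean, sem = mean_and_sem(replicates)
print(f"mean = {mean:.3f}, SEM = {sem:.4f}")
```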

Transposon mutagenesis 

Transposon mutagenesis was carried out on SMG 6 using the
EZ-Tn5<oriV/KAN-2> in vitro transposition kit (Epicentre
Biotechnologies) in accordance with the manufacturer's instructions.
E. coli EPI300 cells were transformed with the transposon
reaction mixture and selected on plates containing Cm and Kan
(12.5 and 50 μg/ml, respectively). Transposon insertions in the
regions of interest were confirmed by PCR. Regions containing
the EZ-Tn5 transposon are approximately 1.9 kb larger than the
region covered by the primers. PCR products of the correct size
were sequenced from the ends of the transposon using the primers
EZTn FP-1 and RP-1 (Table S1) to confirm the location of
transposon insertion. All sequencing was performed by GATC
Biotech (Germany).



Results 

Screening the human gut metagenomic library 

Fifty-three salt-tolerant clones were identified from a screen of
approximately 23,000 fosmid library clones. The clones were
annotated as SMG (for Salt MetaGenome) 1-53. Six clones grew
within 24 hours (SMG 1-6) and the remaining 47 grew over the
following 24-48 hours. This study focused on clones SMG 1
and SMG 6, both of which were found to contain the same insert.
SMG 6 was chosen for further analysis. Previous work has focused
on clones SMG 3, SMG 5 and SMG 25 [22,23]. End
sequencing revealed that another clone, SMG 52, shared the same
sequences at the 5' and 3' ends of the fosmid as SMG 1 and SMG
6. Furthermore, SMG 52 displayed a similar growth profile to
SMG 1 and 6 when grown under sodium chloride (NaCl) stress
and all three clones have a significant (P<0.0001 for all clones)
growth advantage in the presence of 7% added NaCl compared to
the EPI300 host strain carrying the empty fosmid vector
(pCC1FOS) (Figure 1B). No difference in growth between any of
the clones was observed in LB alone (Figure 1A). Further

Figure 1. Growth of metagenomic clones SMG 1, 6 and 52 compared to EPI300 carrying an empty fosmid vector (pCC1FOS) in (A)
LB broth and (B) LB broth supplemented with 7% NaCl. Horizontal axis: time (hours).
doi:10.1371/journal.pone.0103318.g001



investigation involving pyrosequencing revealed SMG 52 con- 
tained the same insert as SMG 1 and SMG 6. 

Fosmid sequencing and bioinformatic analysis 

The fosmid inserts from SMG 1, 6 and 52 were fully sequenced
and assembled by GATC Biotech (Germany) using the GS-FLX
Titanium mini run. All three inserts were found to be identical,
sharing 100% nucleotide identity over the entire length of the 
fosmid insert (~34 kb). Gene prediction using FGENESB 
predicted the presence of thirty putative open reading frames 
(see Table 2). Translated nucleotide sequences were subjected to 
BLASTP (maximum e-value cut-off of le-°'') analysis to identify 
homologous sequences in the database. The vast majority 
corresponded to proteins from the Gram-negative Bacteroidetes 
phylum, with amino acid identities ranging from 26% to 100%. 
Proteins with between 99% and 100% amino acid identity corresponded
to three species of Bacteroides, namely Bacteroides
thetaiotaomicron VPI-5482, Bacteroides sp. 1_1_6 and Bacteroides
sp. 1_1_14. The remainder corresponded to other members of the
phylum Bacteroidetes from genera Alistipes, Prevotella and
Odoribacter, as well as to Gram-positive Firmicutes from the family
Lachnospiraceae and genera Clostridium and Veillonella.

Functional assignment of the encoded proteins on SMG 6 based 
on homology searches using BLASTP revealed that gene 26 was 
predicted to encode a putative membrane protein, although none 
of the potential homologues identified shared greater than 30% 
amino acid identity (placing them in the "twilight zone" of 
evolutionary relatedness). This protein also shared sequence 
similarity with a brp/blh-family β-carotene 15,15'-monooxygenase
from Prevotella marshii DSM 16973 (28% identity over 254
amino acids) and with a proline symporter from Bifidobacterium
bifidum BGN4 (25% identity over 222 amino acids). Given that
proline is an important osmoprotectant utilised by bacteria to 
counteract the deleterious effects of salt-induced osmotic stress 



[41,42], we elected to pursue this gene, which we have named 
brpA, for further study. 

Features of SMG 6 and brpA/BrpA 

The brpA gene is number 26 of the 30 predicted genes on SMG
6 (Fig. 2). It is predicted to be a lone open reading frame, preceded
by a seven-gene operon and followed by a four-gene operon. It
is flanked upstream and downstream by a number of genes
predicted to encode proteins with acetyl-, acyl- or glycosyl-transferase
activities. There are indications that brpA and a
number of adjacent genes have been acquired through lateral gene
transfer (LGT). The SMG 6 fosmid insert is ~34.26 kb and its
overall %G+C content is 41.92%. The highest genetic identities of
a large proportion of the genes are to Bacteroides species, with up
to 100% identity in some cases. The %G+C content of genus
Bacteroides ranges from 40-48%, with B. thetaiotaomicron VPI-5482,
Bacteroides sp. 1_1_6 and Bacteroides sp. 1_1_14 all having
a %G+C content of approximately 43% (Genomes Online
Database, GOLD; http://www.genomesonline.org/). The %G+C
content of the genes on the SMG 6 fosmid insert is illustrated in
Figure 2A. Genes in the first half of the insert, up to and including
gene 16, have a %G+C content of ~45%, similar to the average
%G+C content of the genus Bacteroides. The second half of the
insert displays a clear drop in %G+C content to ~37%. The %G+C
content of some individual genes is also low, including atfA and
brpA (Figure 2A), which share BLAST homology to low G+C
Gram-positive bacteria, mainly from the phylum Firmicutes.
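
The %G+C comparison above rests on a simple per-sequence calculation: the number of G and C bases as a percentage of all unambiguous bases. A minimal sketch is given below; the function name and the toy sequences are assumptions for illustration and are not taken from the SMG 6 insert.

```python
def gc_percent(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Toy sequences, for illustration only.
print(gc_percent("ATGCATTA"))   # 2 of 8 bases are G or C
print(gc_percent("ggccNNat"))   # ambiguous N bases are ignored
```

Applied gene by gene, a calculation of this kind gives the per-gene values that a map such as Figure 2A displays; a run of genes well below the host average is one indication of possible lateral acquisition.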

The brpA gene was predicted to have different start codons
using FGENESB depending on the settings used; the alternative
start codon TTG (leucine) was predicted using the "generic bacterial"
setting, resulting in a 232 amino acid protein. Given that a number of the
proteins on SMG 6 shared 100% amino acid identity with
Bacteroides thetaiotaomicron VPI-5482, this organism was also chosen as the
closest organism for gene prediction; with this setting an ATG
(methionine) was predicted as the start codon, 117 base-pairs upstream of the

Figure 2. Overview of SMG 6 fosmid insert and features of specific genes. (A) Gene map of SMG 6 insert, displaying gene orientation and
individual %G+C content indicated with a gradient colour bar. Gene numbers correspond to those in Table 2 and are drawn approximately to scale.
(B) Focus on genes 25 (atfA) and 26 (brpA), showing the regions cloned for each construct. (C) Detailed view of putative ATG and TTG start codons of
brpA, including upstream regions, as well as predicted promoter regions (highlighted in bold) and transcription factor binding site sequences (blue
and orange boxes). (D) TMHMM prediction of seven transmembrane regions in BrpA.
doi:10.1371/journal.pone.0103318.g002



predicted TTG start codon (encoding a protein of 271 amino
acids). GeneMark was used for gene prediction as a comparison
and it also predicted the same ATG as the start codon. A putative
ribosome binding site (RBS) sequence (AGGTTT) was found
ending seven base-pairs upstream of TTG, while a stronger RBS
sequence (AGTAGG) ended 19 base-pairs upstream of the ATG
start codon. Putative E. coli-type -10 and -35 promoter regions
were detected using BProm (www.softberry.com) upstream of both
putative start codons. Manual inspection of upstream sequences
also revealed the presence of a near perfect Bacteroidetes -7/-33
promoter region (TAGGTTTG/TTTT; consensus TAnnTTTG/TTTG)
[43,44] upstream of the TTG start codon and a
similar sequence (GGTATTTG/TTTT) at -14/-30 upstream
of ATG. The predicted promoter sequences along with
putative transcription factor binding sites can be seen in
Figure 2C. A putative RpoS binding site is found upstream of



the ATG start codon, while an OxyR binding sequence is 
predicted to be located upstream of the TTG start codon. 
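
The manual inspection for the Bacteroidetes -7 element can be mimicked by turning the consensus TAnnTTTG into a pattern in which each "n" matches any base. The sketch below is illustrative only: the helper name is hypothetical, and the short test fragment is an assumed example that embeds the reported TAGGTTTG motif rather than a verified stretch of the SMG 6 sequence.

```python
import re


def consensus_to_regex(consensus):
    """Compile a DNA consensus string, treating 'n' or 'N' as any of A, C, G, T."""
    parts = ("[ACGT]" if base in "nN" else base.upper() for base in consensus)
    return re.compile("".join(parts))


minus7_pattern = consensus_to_regex("TAnnTTTG")

# Assumed fragment for illustration; it contains the TAGGTTTG motif from the text.
fragment = "CTATTTAGGTTTGATTCTT"
match = minus7_pattern.search(fragment)
if match:
    print(match.group(), "found at 0-based offset", match.start())
```

A search of this kind only reports exact matches to the consensus; near matches such as GGTATTTG would need a mismatch-tolerant comparison.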

The BrpA protein was predicted to be a 30.9 kDa membrane 
protein with seven transmembrane regions as predicted with 
TMHMM (Figure 2D). BrpA has a predicted pI of 9.42 and is
composed of ~46% hydrophobic amino acids, similar to other
microbial Brp/Blh proteins (pI range 8.89-9.56 and 48-56%
hydrophobic amino acids) [24]. No signal peptide sequence, 
conserved domains or sequence motifs were detected for BrpA. 
We also searched for motifs in the protein sequences homologous 
to BrpA from BLAST. A lipocalin motif was detected in a 
hypothetical protein from Clostridium sp KLE-1755. Interestingly, 
lipocalin motifs are found in proteins that bind small hydrophobic 
molecules such as retinoids, carotenoids, lipids and steroids [45]. 
Table S2 shows the lipocalin motif and the corresponding motif 
identified in Clostridium sp KLE-1755. The BrpA amino acid 



PLOS ONE | www.plosone.org | July 2014 | Volume 9 | Issue 7 | e103318

Metagenomic Identification of a Novel Salt Tolerance Gene



sequence along with the top 10 BLAST homologues were aligned
to identify conserved residues in these proteins. The residues that
match the lipocalin motif are displayed in green and those that do
not are in red (Table S2).

Due to low BLAST sequence identity, the FFAS03 server was
used with the aim of identifying homologues to BrpA. The best
homologues were an uncharacterised bacterial protein (COG
3274; acyltransferase) and a predicted membrane protein (COG
4763) with significant scores of -40.70 and -23.30 respectively.
Interestingly, the best hit in the Protein Data Bank (PDB)
was to an archaeal-type rhodopsin (3ug9), although the score of
-9.43 did not reach the significance threshold (-9.50).

The IMG/M-HMP database, which contains all metagenomic
datasets encompassing 17 body sites from the Human Microbiome
Project (HMP), was also screened for BrpA homologues. Using a
combination of the most lenient and strictest search criteria
(maximum e-value cut-off of 1e-05 and 1e-50, respectively) BrpA
homologues were identified in the HMP datasets (Figure 3). In
addition, there were 145 hits to the MetaHit dataset using BLAST
on the FFAS03 server.

The brpA gene confers a salt tolerance phenotype when 
heterologously expressed in Escherichia coli 

The brpA gene (gene 26) was cloned from both predicted start
codons and expressed in E. coli MKH13. Both fragments
increased the salt tolerance of MKH13 significantly. Cells
expressing the larger fragment (brpAL) had the most significant
effect (P = 0.0002) in the presence of 3% NaCl. Although cells
expressing the smaller fragment (brpAS) had a slower growth
profile and a longer lag phase than the larger fragment (brpAL),
both exhibited a significant growth advantage compared to the E.
coli MKH13 control harbouring the empty plasmid (pCI372)
(P = 0.0039) (Figure 4B). The gene immediately upstream of brpA
is predicted to encode a 98 amino acid putative membrane protein
(putative acyltransferase), which we have named atfA. The atfA
gene was also cloned in combination with brpA (brpAatfA). The two
genes in combination did not increase the salt tolerance of
MKH13 relative to brpAL alone when grown in LB+3% NaCl,
although the increase in salt tolerance was still significant (P = 0.0002)
(Figure 4B).

L-proline did not increase salt tolerance further 

Once we had shown that the brpA gene could confer a salt 
tolerance phenotype when expressed in E. coli, we aimed to 
decipher the mechanism of action and thereby assign a function to 
the encoded protein. Given that BLASTP analysis of the BrpA 
sequence revealed homology to a proline symporter, growth 
curves were carried out in minimal media supplemented with L- 
proline and also other common osmoprotectants, betaine and L- 
carnitine (final concentration of 1 mM). However, no growth 
advantage was seen in the presence of any of the added 
osmoprotectant compounds, suggesting that BrpA is not an 
osmoprotectant uptake system. 

Functional annotation of brpA 

BLASTP analysis also revealed that the BrpA protein exhibited 
homology to a brp/blh-family β-carotene 15,15'-monooxygenase.
Such proteins are related to bacteriorhodopsins [24], and are
annotated as bacterio-opsin related protein (brp)/brp-like homologue
(blh) protein. Brp/Blh proteins have been shown to have
β-carotene 15,15'-monooxygenase activity, cleaving β-carotene into
two molecules of all-trans retinal (vitamin A aldehyde) [25]. The
derived retinal is bound by a rhodopsin protein and cells 

Figure 3. BrpA homologues identified when BLAST searched
against Human Microbiome Project (HMP) datasets from 17
body sites at the two maximum e-value cut-offs used
(1e-05 and 1e-50; panels A and B).
doi:10.1371/journal.pone.0103318.g003

expressing such proteins acquire an orange/red colour, indicative 
of the presence of retinal in the cell membrane [46-48]. Strains
harbouring brpA were grown in the presence of β-carotene and
cell pellets were observed for the development of the characteristic
red/orange colour. E. coli MKH13 cells carrying the brpA gene
on the pCI372 plasmid did not show any obvious colour
development, most likely due to the fact that pCI372 is not
inducible (Figure 5A).

Given that a number of previous studies have reported a
requirement for the use of an inducible vector to visualise
pigmentation in cell pellets [46-50], we cultured the original
fosmid clones (which can be induced due to the Copy Control
capability of the pCC1FOS fosmid vector) in the presence of
β-carotene and included an induction solution to induce the fosmid
from low to high-copy number. The cell pellets developed an
intense red/orange colour while cells with an empty vector did not
(Figure 5B). To confirm that the BrpA protein was responsible for 
this phenotype, we cloned brpA in isolation into the pBAD 

Figure 4. Growth of E. coli MKH13::pCI372 and E. coli MKH13 carrying a plasmid encoded copy of either brpAL, brpAS or brpAatfA in
(A) LB broth or (B) LB+3% NaCl. All of the genes confer a significant salt tolerance phenotype to MKH13 relative to cells with an empty plasmid
vector. All values are the average of triplicate experiments and error bars are representative of the standard error of the mean (SEM).
doi:10.1371/journal.pone.0103318.g004



inducible expression vector and transformed it into E. coli EPI300 
and repeated the growth experiments. Again, the cell pellets 
developed a distinctive red/orange colour (Figure 5C).



brpA also confers salt tolerance to E. coli EPI300 

The genes (brpAL, brpAS and brpAatfA) were also cloned into
the pBAD expression vector and transformed into E. coli EPI300.
All of the transformed strains exhibited increased salt tolerance
relative to the host containing the empty pBAD vector, although 



Figure 5. Pigmentation observed in cell pellets. (A) Appearance of cell pellets grown in LB supplemented with β-carotene. From left to right: E.
coli MKH13::pCI372, MKH13::pCI372-brpAL, MKH13::pCI372-brpAS and MKH13::pCI372-brpAatfA. (B) Appearance of cell pellets of clones grown in LB
supplemented with β-carotene and Copy Control Induction solution (L-arabinose). From left to right: E. coli EPI300::pCC1FOS, SMG 1, SMG 6 and SMG
52. (C) Appearance of cell pellets grown in LB supplemented with β-carotene and L-arabinose. From left to right: E. coli EPI300::pBAD, EPI300::pBAD-brpAS,
EPI300::pBAD-brpAL, and EPI300::pBAD-brpAatfA.
doi:10.1371/journal.pone.0103318.g005

EPI300::pBAD-brpAS to a lesser extent, similar to our observations
with MKH13 above (Figure S1).

Effect of β-carotene on survival in high-salt media

The effect of β-carotene on survival of both E. coli MKH13 and
EPI300 strains was assessed. Survival of strains carrying a plasmid-encoded
copy of brpA was compared to controls (carrying an
empty plasmid) in high-salt media (3% NaCl for MKH13 and 7%
for EPI300) in the presence and absence of β-carotene after a
48-hour period, both aerobically and anaerobically (Figure S2).
β-carotene did not provide an osmoprotective effect during salt stress
to control strains or strains carrying a copy of the brpA gene under
the conditions tested; however, an increased salt tolerance
phenotype was observed under both aerobic and anaerobic
conditions.

Transposon mutagenesis 

Transposon mutagenesis was performed using the EZTn5 
in vitro transposition system (Epicentre Biotechnologies) to create 
knock-out mutants of SMG 6. Clones harbouring a transposon 
insertion in brpA and neighbouring genes were identified by 
PCR. The primer pair brpAatfA FP and RP was used to amplify 
this region, generating PCR products of ~1.4 kb in the absence of 
a transposon insertion and products of ~3.3 kb if the transposon 
was present (Figure 6A). Once positive clones were identified, the 
location of the transposon was confirmed by sequencing from the 
ends of the transposon. We identified four transposon mutants in 
SMG 6, namely 6-EZTn #24, #26, #34 and #38. The locations 
of the transposon insertions are presented in Figure 6B. The aim 
was to identify clones that lacked pigmentation following 
transposition. Clones containing a transposon insertion do not 
display the same intense red pigmentation seen with SMG 6; 
although there is visibly less pigmentation, some residual colour 
nevertheless remains (Figure 6C). 
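
The PCR screen described above is in effect a size classification of amplicons. A minimal Python sketch of that logic follows; the function name and the 0.3 kb tolerance for gel-based size estimates are assumptions made for illustration and are not taken from the study.

```python
# Expected amplicon sizes (kb) as reported in the text:
# ~1.4 kb without a transposon, ~3.3 kb with one.
WILD_TYPE_KB = 1.4
INSERTION_KB = 3.3


def classify_amplicon(size_kb, tolerance_kb=0.3):
    """Classify a PCR product by its estimated size.

    tolerance_kb is an assumed allowance for gel-based size
    estimation; it is not a value from the study.
    """
    if abs(size_kb - WILD_TYPE_KB) <= tolerance_kb:
        return "no insertion"
    if abs(size_kb - INSERTION_KB) <= tolerance_kb:
        return "transposon insertion"
    return "unexpected size"


# Hypothetical band sizes read from a gel
for band in (1.4, 3.2, 2.4):
    print(band, classify_amplicon(band))
```

A band that matches neither expected size is flagged rather than forced into a category, since only sequencing from the transposon ends confirms the insertion site.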

Discussion 

In the current study we have identified and characterised a 
novel salt tolerance locus from the human gut microbiome. 
Functional assignment of its encoded protein, BrpA, using BLAST 
returned homologues mainly annotated as hypothetical or putative 
membrane proteins. The only clue to the possible function of the 
protein was that it also shared sequence similarity (albeit at <30%) 
to a proline symporter and a brp/blh-family β-carotene 15,15'-monooxygenase. 
Sequence homologies of less than 30% are 
considered to be in the "twilight zone" and confidence in 
functional annotations diminishes below this threshold [51,52]. 
Nevertheless, we felt it was worth investigating this gene further as 
proline is a well-known compound utilised by bacteria as an 
osmoprotectant when exposed to osmotic stress. 

Growth experiments in minimal media supplemented with L-proline 
and other osmoprotectants had no effect on growth or salt 
tolerance. The gene, which we have termed brpA, may 
encode a brp/blh-family β-carotene 15,15'-monooxygenase. 
Such proteins have been shown to catalyse the conversion of 
β-carotene into two molecules of all-trans retinal (vitamin A 
aldehyde) (Figure 7A) [24,25]. Growth of the metagenomic clone 
SMG 6 in the presence of exogenous β-carotene resulted in cell 
pellets with a distinctive orange/red colour. A number of other 
studies have shown that bacterial cells expressing plasmid-encoded 
β-carotene biosynthesis genes in addition to a brp/blh gene and a 
proteorhodopsin (PR)-encoding gene adopt a similar colour due to 
the cleavage of β-carotene to retinal and subsequent binding of 
retinal by proteorhodopsins in the cell membrane [46-49]. However, 
there is no obvious PR-encoding gene on SMG 6, so this mechanism 
does not explain the presence of colour in the SMG clones' cell 
pellets. Furthermore, when brpA was cloned in isolation the cell 
pellets still had pigmentation, indicating that brpA alone is 
sufficient to confer this phenotype. There are, however, a few 
possible explanations for the pigmentation. In silico analysis 
reveals that brpA is predicted to have acyltransferase activity 
(COG 3274), as is atfA, the gene immediately upstream of brpA. 
The atfA gene was cloned in combination with brpA; however, 
expression of both genes together had no appreciable effect on the 
degree of pigmentation or salt tolerance observed. Carotenoids 
and retinoids are hydrophobic, lipophilic molecules. The majority 
of carotenoids are found embedded in the hydrophobic core of 
lipid membranes and in lipid globules and other hydrophobic 
environments [53,54]. Acylated carotenoids have been shown to 
be inserted in the membrane and the predicted acyltransferase 
activity of BrpA may explain the cell pellet pigmentation in the 
absence of a rhodopsin protein [55]. In Staphylococcus aureus, an 
acyltransferase is a key enzyme in the biosynthesis pathway for the 
orange carotenoid staphyloxanthin [56]. This enzyme was initially 
thought to carry out the final step in staphyloxanthin biosynthesis, 
although more recently it has been shown that it is actually the 



Figure 6. EZTn5 transposon mutagenesis of SMG 6 was performed to identify mutants lacking pigmentation when grown in the 
presence of β-carotene. (A) Clones positive for a transposon in this region of the SMG 6 fosmid insert were identified by PCR, with amplicons of 
~3.3 kb indicative of an insertion event. (B) Approximate locations of transposon insertions in relation to brpA and neighbouring genes. (C) 
Appearance of cell pellets of SMG 6 and transposon insertion mutants (EZTn #24, #26, #34 and #38) following growth in the presence of 
β-carotene. 

doi:10.1371/journal.pone.0103318.g006 

Figure 7. Possible mechanism(s) of action of BrpA. (A) Representation of the known reaction for the formation of retinal. β-carotene is 
cleaved at its central 15,15' bond by brp 15,15'-β-carotene monooxygenase to form two molecules of all-trans retinal (vitamin A aldehyde). We 
propose that brpA may be regulated from two promoters, with translation being initiated from one of two potential start codons (ATG and TTG), 
depending on environmental conditions. While speculative, we illustrate some possibilities discussed in the text. (B) Pigmentation phenotype: 
regulation of brpA from promoter 1 (upstream of the ATG start codon) under "normal" cellular conditions, or possibly by β-carotene, could result in (B1) 
BrpA adding an acyl group to β-carotene, allowing it to interact with phosphate head groups of lipids and anchoring it in the hydrophobic core of the 
lipid membrane, or (B2) BrpA cleaving β-carotene to retinal and subsequently binding the derived retinal, anchoring it in the cell membrane. (C) Stress 
response: regulation of brpA from promoter 2 (upstream of the TTG start codon) may be initiated by environmental signals such as changes in external 
osmolarity, resulting in increased tolerance or resistance to environmental stress, such as increased NaCl concentrations, by an as yet unknown 
mechanism. Alternative start codons, such as TTG, have been found in a number of stress response genes. 
doi:10.1371/journal.pone.0103318.g007 



penultimate step [57]. The transfer of a polar acyl group or acyl-containing 
groups such as hydroxyl or keto groups to carotenoids 
would be likely to enable their interaction with phosphate head 
groups of lipids, thus anchoring them within membranes [55,58]. 

A lipocalin motif was identified in a BLAST 
homologue of BrpA. Lipocalin proteins can bind hydrophobic 
molecules such as carotenoids and retinoids. It seems unlikely, 
however, that this is the case with BrpA since the motif is quite 
different and lacks the characteristic glycine-X-tryptophan (G-X-W) 
signature found in almost all lipocalins [59]. The BrpA protein 
has seven predicted transmembrane regions, a characteristic 
shared with rhodopsin proteins [60]. It has previously been 
suggested that Brp/Blh-like proteins may be multifunctional and 
both cleave β-carotene and subsequently transport or bind the 
derived all-trans retinal, although this has not been demonstrated 
experimentally [25]. 

Four transposon mutants of SMG 6 were identified in this study 
using PCR. We expected to obtain mutants that lack 
pigmentation when grown in the presence of β-carotene. While 
there is a clear visible difference in the appearance of the cell 
pellets of the mutants compared to SMG 6, each of the mutants 
retains some level of pigmentation, albeit to a lesser degree and 
with diminished colour intensity. Transposon insertion in genes 
upstream of brpA (mutants #24 and #34) indicates a polar effect 
mediating the reduction in the degree of pigmentation. It is 
surprising that some pigmentation remains in clones containing a 
transposon within the brpA gene (mutants #26 and #38), 
indicating residual carotenoid accumulation, possibly due to 
acyltransferase activity of atfA. 

The % G+C content of individual genes on SMG 6 drops as low 
as 30.64% for gene 25 (atfA), while its neighbouring gene, brpA, is 
32.05%. In addition, only 12% of the top 100 BLASTP hits to 
BrpA are predicted to be from Gram-negative bacteria. The 
remaining 88% are represented in the main by proteins with 
similarity to the low G+C, Gram-positive Firmicutes phylum, 
mainly from the genera Clostridium, Enterococcus and 
Streptococcus among others. Taken together, these observations 
suggest that much of this region, including the especially low % G+C 
atfA and brpA genes, was acquired through an LGT event [61,62]. 
Indeed, in support of this there is evidence that brp/blh-type genes, 
along with rhodopsins, undergo frequent LGT events [46,63-65]. 
In β-carotene producing bacteria, only these two genes are 
required to produce retinal, which is bound to the rhodopsin 
protein, giving the recipient bacterium the ability to harvest light 
energy non-photosynthetically and convert it to chemical energy. 
Acquiring a rhodopsin gene in the gut would be somewhat 
redundant owing to the aphotic nature of the gut environment. A 
brp/blh β-carotene monooxygenase, however, could be beneficial to 
break down dietary-derived β-carotene. 
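
The per-gene % G+C values quoted above follow from a simple base count. A minimal Python sketch is given below; the function name and the toy sequence are hypothetical and serve only to illustrate the calculation, not to reproduce the values reported for atfA or brpA.

```python
def gc_percent(seq):
    """Return % G+C of a nucleotide sequence.

    Ambiguous bases (e.g. N) are ignored, so the denominator
    counts only A, C, G and T.
    """
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt


# Toy sequence for illustration only (not from SMG 6)
toy_gene = "ATGAAATTTGCATTATAA"
print(round(gc_percent(toy_gene), 2))
```

Applied gene by gene along a fosmid insert, the same calculation highlights regions whose composition departs from that of their neighbours.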

There were two possible start codons predicted for the brpA 
gene using the FGENESB gene prediction program. Using the 
"bacterial generic" parameter as closest organism, a gene (brpAS) 
encoding a 232 amino acid protein with the alternative initiation 
codon TTG (leucine) was predicted. Because a number of proteins 
encoded on SMG 6 shared 100% amino acid identity with 
Bacteroides thetaiotaomicron VPI-5482, this organism was also 
used as the "closest organism" parameter. Using B. thetaiotaomicron 
VPI-5482 as "closest organism" predicted a gene (brpAL) 
encoding a 271 amino acid protein with an ATG (methionine) 
start codon. GeneMark also predicted ATG to be the start codon. 
Cloning and expression of the gene from both predicted start 
codons conferred salt tolerance to E. coli, although strains 
expressing the brpAL fragment had a shorter lag phase and 
reached a higher final OD. Initially, it seemed likely that ATG was 
the true start codon of brpA; however, further manual inspection of 
the sequences upstream of both start codons revealed a 
characteristic Bacteroides -7/-33 promoter region preceding 
the TTG codon that deviated from the consensus by only one 
nucleotide. There is also a potential Bacteroides-type promoter 
upstream of ATG, but at position -14/-30 (GGTATTTG/TTTT). 
It therefore seems likely that TTG is the actual start 
codon in Bacteroides. Interestingly, previous studies have shown 
that the use of alternative initiation codons, other than ATG, is a 
common feature of osmotolerance genes in a number of 
gastrointestinal pathogens [12,17]. The increased salt tolerance 
phenotype of brpAL compared to brpAS may be due to the fact 
that ATG is the most commonly utilised codon to initiate 
translation (~90% of genes) in E. coli [66] and also to the presence 
of a strong RBS (AGUAGGU) upstream of the ATG start codon, 
which differs from the E. coli consensus RBS (AGGAGGU) by 
only one nucleotide. Taken together, the ATG start codon and 
strong E. coli RBS likely give rise to more efficient levels of 
transcription and translation, as well as increased expression of 
brpA in E. coli, at least under the conditions tested in the current 
study. It is of course possible that the two protein types (long and 
short) are expressed under different environmental conditions, as 
was previously reported for the multi-stress resistance locus HtrA 
[67]. 
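
The one-nucleotide difference between the RBS upstream of the ATG start codon and the E. coli consensus can be checked by a position-by-position comparison. A minimal Python sketch, using the two sequences quoted in the text; the function and variable names are illustrative only.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


BRPA_RBS = "AGUAGGU"             # RBS upstream of the ATG start codon (from the text)
ECOLI_CONSENSUS_RBS = "AGGAGGU"  # E. coli consensus RBS (from the text)

print(mismatches(BRPA_RBS, ECOLI_CONSENSUS_RBS))  # prints 1
```

The single mismatch falls at the third position (U in place of G).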

A putative RpoS binding site is predicted 
upstream of the ATG start codon of brpA. The alternative sigma 
factor (sigma 38) RpoS is the master regulator of the general stress 
response induced during stationary phase in E. coli and other 
Gram-negative bacteria [68]. In addition, RpoS regulates the 
expression of a large number of genes in response to various 
stresses, including salt stress [69-71]. There is also a putative 
OxyR binding site in the upstream region of brpA. OxyR is a 
regulator of the oxidative stress response in many bacteria [72] 
and carotenoids can function as anti-oxidants and can increase 
resistance to oxidative stress [73,74]. It is possible that the brpA 
gene is transcribed from two promoters under different environmental 
conditions, similar to the type of regulation seen with the 
osmoprotectant transporter ProP in E. coli, where the proP gene is 
transcribed from promoter 1 (P1) primarily in response to changes 
in osmolarity and from promoter 2 (P2) during stationary phase 
[75,76]. 

The BrpA amino acid sequence was used to BLAST search 
against all metagenomes from the HMP dataset at the lowest 
(1e-05) and highest (1e-50) e-value. Hits to BrpA were most 
abundant in the stool, supra-gingival plaque and tongue 
metagenome samples at the lowest e-value (Figure 3B). The 
majority of these hits had quite low percentage identities in the 
range of 25%-35%. When the e-value cut-off was increased to 
1e-50, only 13 putative BrpA homologues were identified and only 
from the stool metagenome samples (Figure 3A). brpA would 
therefore appear to be a rare gene found in some strains of 
Bacteroides thetaiotaomicron, which is one of the most abundant 
species in the human gut microbiome, having been shown to 
comprise 6% of all bacteria among the human gut microbiota 
[77]. It is interesting that homologues of this gene are found most 
abundantly in body sites (tongue, sub- and supragingival plaque 
and gut lumen/stool) where the microbiota would encounter 
β-carotene (i.e. from dietary sources). 

Carotenoids have been shown to protect cells from various 
environmental stresses such as osmotic, oxidative and light stress, as well 
as reinforcing membranes and increasing membrane rigidity 
[54,74,78-80]. In this study, however, β-carotene did not provide 
any further increase in salt tolerance under the conditions tested 
and therefore does not appear to function in an osmoprotective 
capacity. Acyltransferase enzymes have also been linked to various 
stress responses, including osmotic stress. For example, the 
acyltransferase HtrB provides protection against, and exhibits 
increased expression in response to, heat, acid, oxidative and 
osmotic stress in Campylobacter jejuni and Salmonella typhimurium 
[81], while acyltransferases have also been linked to the stress 
response in Pseudomonas putida [82]. 

In the current study we have used a combined functional 
metagenomic and bioinformatic approach to identify a novel gene 
from the human gut microbiome that has not previously been 
linked to salt tolerance. The gene, brpA, encodes a protein with 
homology to a brp/blh-family β-carotene 15,15'-monooxygenase. 
When expressed in E. coli, BrpA confers a salt tolerance phenotype 
and cell pellets adopt a red/orange pigmentation when grown in 
the presence of exogenous β-carotene. 

Supporting Information 

Figure S1 Growth of E. coli EPI300::pBAD and 
EPI300::pBAD-brpAS (P = 0.0008), EPI300::pBAD-brpAL 
(P = 0.0002) and EPI300::pBAD-brpAatfA (P = 0.0001) in 